Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: implement at #53

Merged
merged 8 commits into from
Jan 3, 2025
Merged

ENH: implement at #53

merged 8 commits into from
Jan 3, 2025

Conversation

crusaderky
Copy link
Contributor

@crusaderky crusaderky commented Dec 9, 2024

pyproject.toml Outdated Show resolved Hide resolved
tests/test_at.py Outdated Show resolved Hide resolved
tests/test_at.py Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
@lucascolley lucascolley marked this pull request as ready for review December 9, 2024 14:30
@lucascolley lucascolley marked this pull request as draft December 9, 2024 14:31
@crusaderky
Copy link
Contributor Author

Rebased on top of #58

@crusaderky
Copy link
Contributor Author

crusaderky commented Dec 11, 2024

  • Reverted rebase onto Saner linters #58
  • Fixed all linters issues
  • Added unit tests for all backends
  • Removed **kwargs. Their main original appeal was allowing to pass performance tweaks to JAX, which are indices_are_sorted=True and unique_indices=True. However, I have now realised that the array API specification does not support integer array indices, at all, and both of these parameters are pointless without.
  • I'm having big second thoughts on get() - read below.

@crusaderky
Copy link
Contributor Author

crusaderky commented Dec 11, 2024

Any idea how to fix this? Looks like the ubuntu_latest VM has an obsolete driver (or more likely no driver)

ERROR cupy_backends.cuda.api.runtime.CUDARuntimeError: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version

@lucascolley
Copy link
Collaborator

I don't think we can get GPU CI without paying someone for it, cc @rgommers .

@crusaderky
Copy link
Contributor Author

I don't think we can get GPU CI without paying someone for it, cc @rgommers .

Can cupy run on a CPU-only host?
I've disabled it in CI for the time being

@rgommers
Copy link
Member

I don't think we can get GPU CI without paying someone for it, cc @rgommers .

Yep, that's on my radar to push forward this month, on multiple projects. Please feel free to open a new issue and assign it to me. I think we can hook up a shared GPU runner between this project and array-api-compat.

Can cupy run on a CPU-only host?

I don't think so.

@lucascolley lucascolley mentioned this pull request Dec 26, 2024
8 tasks
@crusaderky
Copy link
Contributor Author

@lucascolley ready for merge and release! 🥳

@lucascolley
Copy link
Collaborator

good to proceed @rgommers ?

@lucascolley
Copy link
Collaborator

looks great, thanks! Note to self: we ought to add

# sphinx-copybutton configurations
copybutton_prompt_text = r">>> |\.\.\. |\$ |In \[\d*\]: | {2,5}\.{3,}: | {5,8}: "
copybutton_prompt_is_regexp = True

to docs/conf.py.

@crusaderky
Copy link
Contributor Author

crusaderky commented Jan 2, 2025

Annoyingly, codecov is glitching (again). It's stating in the "Files" view of this PR that the tests never run under JAX, but if you follow the link from the log of Check ci-backends it says that everything's green.

@lucascolley
Copy link
Collaborator

yeah, happy to just ignore that, but also happy if anyone figures out how to fix that

@rgommers rgommers self-requested a review January 2, 2025 20:38
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @crusaderky and reviewers. This is looking quite good. A couple of comments (disclaimer: it's a little hard to dig through the already-resolved review comments, so I may be saying something that was already discussed there):

(1) The copy=True default seems wrong to me. The main point of this feature is that we can replace in-place operations like x[idx, :] += y in SciPy et al. with something new to support JAX. That something new should be the idiomatic JAX way, so x = xpx.at(x)[idx, :].add(y). This should come with as little of a performance penalty for NumPy usage as possible - hence the copy=True default doesn't work. It's already a significant concession to code readability to have to avoid += & co, and having to also add copy=None in all places where performance matters seems too much.

(2) When playing around with this branch, most things work as expected, but this was a little surprising:

>>> x = np.arange(5)
>>> x = xpx.at(x)[[3, 4]].divide(1.5)
>>> x.dtype
dtype('int64')
>>> x = np.arange(5)
>>> x /= 1.5
Traceback (most recent call last):
  Cell In[32], line 1
    x /= 1.5
UFuncTypeError: Cannot cast ufunc 'divide' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Is this on purpose or by accident? Raising on these cases may be cleaner if that can be done in a performant way.

(3) Library support is pretty comprehensive already; a few others come to mind. MLX is already supported by the generic path since it allows in-place syntax (I think, didn't verify). It also has array.at which matches JAX syntax/semantics. ndonnx is going to need explicit support in the future as well, since it's immutable like JAX, but that can be left for later since it doesn't seem to have Array.at or another alternative here. tl;dr nothing to change in this PR I think.

Allow for the alternate syntax ``at(x)[start:stop:step]``.

It looks prettier than ``at(x, slice(start, stop, step))``
and feels more intuitive coming from the JAX documentation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be the default way of doing things in the examples? Best to steer people to what feels idiomatic to JAX users I'd think. I'd de-emphasize, or perhaps even leave out completely (if there are no downsides to that), the alternative syntax.

Copy link
Contributor Author

@crusaderky crusaderky Jan 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been writing several prototype PRs on various submodules of Sphinx.
From the experience I gathered there, I must say that at(x)[idx] feels more natural and easier on the eyes when there are multiple indices and/or slices, whereas at(x, idx) feels nicer and less cluttered in all other cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fair enough - seems like an argument to keep both then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I've been considered writing aliases for the benefit of readability set_at(x, idx, value) etc. but haven't gone for them for the sake of keeping the API surface small.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not adding a third way sounds like a healthy idea at least for now - let's see how we like this in its current form I'd say.

@crusaderky
Copy link
Contributor Author

(1) The copy=True default seems wrong to me.

Changed it to copy=None

(2) x /= 1.5

TIL that

  • x[idx] = x[idx].__add__(value)
  • x[idx] = x[idx].__iadd__(value)

are almost always equivalent!

Fixed.

tests/test_at.py Outdated Show resolved Hide resolved
@lucascolley
Copy link
Collaborator

shall we get going with a release? :)

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ready to me now, +1 for merge & release if everyone else is happy too.

@lucascolley
Copy link
Collaborator

let's give this a whirl, thanks all! if you can point the SciPy PR to the release when it is out @crusaderky, I can update the sklearn PR

@lucascolley lucascolley merged commit 84bf725 into data-apis:main Jan 3, 2025
10 checks passed
@crusaderky crusaderky deleted the at branch January 3, 2025 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request new function
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants